Measures of Spread

Measures of spread summarize a data set by describing how spread out its values are. When the spread is large, the mean is less representative of the data than when the spread is small, because a large spread means that individual data points tend to differ substantially from one another.

The spread of the values can be measured for quantitative data, because the variables are numeric and can be arranged in a logical order from a low end value to a high end value. Measures of spread are used in conjunction with a measure of central tendency, such as the mean or median, to provide an overall description of a set of data. A number of statistics are available to describe spread, including the range, quartiles, and standard deviation. Which one you use depends on how much data you collected and which measure of central tendency you calculated.

What Measure of Central Tendency Did You Calculate?

  • Mean with 5 or more trials per level of manipulation: use the Standard Deviation
  • Mean with 4 or fewer trials per level of manipulation: use the Range
  • Median with 5 or more trials per level of manipulation: use the Quartiles
  • Median with 4 or fewer trials per level of manipulation: use the Range
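The selection guide above can be sketched as a small Python function. The function name and the string labels are hypothetical, chosen only for illustration; the 5-trial cutoff comes from the guide itself.

```python
def choose_spread_measure(central_tendency: str, trials_per_level: int) -> str:
    """Return the measure of spread suggested by the guide above."""
    if central_tendency not in ("mean", "median"):
        raise ValueError("central_tendency must be 'mean' or 'median'")
    if trials_per_level <= 4:
        # With 4 or fewer trials, the guide suggests the range in both cases.
        return "range"
    if central_tendency == "mean":
        return "standard deviation"
    return "quartiles"


print(choose_spread_measure("mean", 6))
print(choose_spread_measure("median", 3))
```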

Standard Deviation

Averages do not tell us everything about a sample.  Samples can be very uniform with the data all bunched around the mean (Figure 1) or they can be spread out a long way from the mean (Figure 2). The statistic that measures this spread for normally distributed data is called the standard deviation. The wider the spread of scores, the larger the standard deviation.

Picture

For data that has a normal distribution, about 68% of the data lies within one standard deviation of the mean.
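One way to see this rule in action is a quick simulation. The sketch below draws made-up values from a normal distribution (the mean of 50 and standard deviation of 10 are arbitrary assumptions) and counts how many fall within one standard deviation of the sample mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch gives the same result each run

# Hypothetical data: 100,000 values from a normal distribution.
sample = [random.gauss(50.0, 10.0) for _ in range(100_000)]

mean = statistics.mean(sample)
sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Count the values that lie no more than one SD away from the mean.
within_one_sd = sum(1 for x in sample if abs(x - mean) <= sd)
fraction = within_one_sd / len(sample)

print(f"{fraction:.1%} of the values lie within one SD of the mean")
```

With a sample this large the printed percentage should land close to 68%, though the exact figure varies slightly with the random draw.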

How to Calculate the Standard Deviation:

SD = √( ∑(x - x̄)^2 / (n - 1) ), where x̄ is the mean of the data and n is the sample size.

  1. Calculate the mean (x̄) of a set of data.
  2. Subtract the mean from each data point to determine (x - x̄).  You'll do this for each data point, so you'll have multiple values of (x - x̄).
  3. Square each of the resulting numbers to determine (x - x̄)^2.  As in step 2, you'll do this for each data point, so you'll have multiple values of (x - x̄)^2.
  4. Add the values from the previous step together to get ∑(x - x̄)^2.  Now you should be working with a single value.
  5. Calculate (n - 1) by subtracting 1 from your sample size.  Your sample size is the total number of data points you collected.
  6. Divide the answer from step 4 by the answer from step 5.
  7. Calculate the square root of your previous answer to determine the standard deviation.
  8. Be sure your standard deviation has the same precision (number of digits after the decimal) as your raw data, so you may need to round your answer.
  9. The standard deviation should have the same unit as the raw data you collected.  For example, SD = +/- 0.5 cm.
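The steps above can be traced one at a time in Python. This is a minimal sketch: the five measurements and their unit (cm) are hypothetical, and the variable names are chosen only to mirror the steps.

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 5.0, 7.0]  # hypothetical measurements in cm

mean = sum(data) / len(data)                 # step 1: the mean
deviations = [x - mean for x in data]        # step 2: (x - mean) for each point
squared = [d ** 2 for d in deviations]       # step 3: square each deviation
sum_of_squares = sum(squared)                # step 4: add the squares together
n_minus_1 = len(data) - 1                    # step 5: sample size minus 1
quotient = sum_of_squares / n_minus_1        # step 6: divide step 4 by step 5
sd = math.sqrt(quotient)                     # step 7: take the square root
sd_rounded = round(sd, 1)                    # step 8: match the data's precision

print(f"SD = +/- {sd_rounded} cm")           # step 9: report with the unit

# Cross-check against Python's built-in sample standard deviation.
assert math.isclose(sd, statistics.stdev(data))
```

For this data set the mean is 4.4, the sum of squares is 13.2, and dividing by 4 gives 3.3, so the standard deviation rounds to 1.8 cm.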

Calculating the Standard Deviation in Sheets

Using Excel to Calculate the Standard Deviation

  • Use Excel to calculate the mean of your data.
  • Click on the box in which you want the Standard Deviation to be placed  
  • Click the "Formulas" tab at the top of the screen
  • Select the "Insert Function" button.
  • Search for the STDEV function (newer versions of Excel also list it as STDEV.S) and click OK.
  • Highlight the data for which you want the SD calculated, then click OK.  Be sure not to select the mean as one of your data points when calculating the standard deviation.  This is a common mistake.
  • Once you have the mean and standard deviation, make sure that you set the values to the correct number of digits.  Excel will default to giving you too many digits after the decimal point.  Your mean and standard deviation must have the same precision (number of digits after the decimal) as your data points.  To do this, click the box that displays the standard deviation and, on the "Home" tab, click the decrease decimal button until the correct number of digits is showing.

Quartiles

Quartiles divide an ordered dataset into four equal parts; the quartiles themselves are the values at the boundaries between those parts. Quartiles are a useful measure of spread because they are much less affected by outliers or a skewed data set than the standard deviation. For this reason, the median and quartiles are often reported together as the best measures of central tendency and spread, respectively, when dealing with skewed data and/or data with outliers.

A common way of expressing quartiles is as an interquartile range. The interquartile range (IQR) is the difference between the third quartile (Q3) and the first quartile (Q1), and it tells us about the range of the middle half of the scores in the distribution.  The IQR is often seen as a better measure of spread than the range because it is much less affected by outliers.
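As an illustration, the sketch below compares the IQR and the range for a hypothetical set of scores, with and without a single outlier. Note one assumption: textbooks and software use slightly different conventions for locating quartiles, and the `method="inclusive"` option of Python's `statistics.quantiles` is just one of them, so quartile values computed by hand or in a spreadsheet may differ slightly.

```python
import statistics


def iqr_and_range(values):
    """Return (IQR, range) for a list of numbers."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1, max(values) - min(values)


scores = [2, 4, 4, 5, 6, 7, 8, 9, 10, 11]  # hypothetical scores
scores_with_outlier = scores + [40]         # same scores plus one outlier

iqr_a, range_a = iqr_and_range(scores)
iqr_b, range_b = iqr_and_range(scores_with_outlier)

print(f"Without outlier: IQR = {iqr_a}, range = {range_a}")
print(f"With outlier:    IQR = {iqr_b}, range = {range_b}")
```

Adding the one outlier stretches the range from 9 to 38, while the IQR changes far less.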

Range

Range is the difference between the largest value and the smallest value in a dataset.  Range is used when fewer than 5 trials are available for calculating a measure of central tendency.
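A minimal sketch of the calculation, using four hypothetical trial measurements in cm:

```python
trials = [12.1, 11.8, 12.6, 12.0]  # hypothetical: 4 trials, measured in cm

# Range = largest value minus smallest value.
data_range = max(trials) - min(trials)

# Report with the same precision and unit as the raw data.
print(f"Range = {data_range:.1f} cm")
```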
